AITopics | augmented reality

augmented reality

Spatiotemporal Calibration and Ground Truth Estimation for High-Precision SLAM Benchmarking in Extended Reality

Shu, Zichao, Bei, Shitao, Li, Lijun, Chen, Zetao

arXiv.org Artificial Intelligence, Dec-9-2025

Simultaneous localization and mapping (SLAM) plays a fundamental role in extended reality (XR) applications. As the standards for immersion in XR continue to increase, the demands for SLAM benchmarking have become more stringent. Trajectory accuracy is the key metric, and marker-based optical motion capture (MoCap) systems are widely used to generate ground truth (GT) because of their drift-free and relatively accurate measurements. However, the precision of MoCap-based GT is limited by two factors: the spatiotemporal calibration with the device under test (DUT) and the inherent jitter in the MoCap measurements. These limitations hinder accurate SLAM benchmarking, particularly for key metrics like rotation error and inter-frame jitter, which are critical for immersive XR experiences. This paper presents a novel continuous-time maximum likelihood estimator to address these challenges. The proposed method integrates auxiliary inertial measurement unit (IMU) data to compensate for MoCap jitter. Additionally, a variable time synchronization method and a pose residual based on screw congruence constraints are proposed, enabling precise spatiotemporal calibration across multiple sensors and the DUT. Experimental results demonstrate that our approach outperforms existing methods, achieving the precision necessary for comprehensive benchmarking of state-of-the-art SLAM algorithms in XR applications. Furthermore, we thoroughly validate the practicality of our method by benchmarking several leading XR devices and open-source SLAM algorithms. The code is publicly available at https://github.com/ylab-xrpg/xr-hpgt.
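For readers unfamiliar with trajectory accuracy as a benchmarking metric, the following is a minimal sketch of a root-mean-square translational error between an estimated trajectory and a ground-truth trajectory. It is not the estimator proposed in the paper: it assumes the two sequences are already time-synchronized and expressed in the same coordinate frame, which is precisely the spatiotemporal calibration the abstract identifies as a limiting factor. The function name and the sample positions are hypothetical.

```python
import math


def ate_rmse(estimated, ground_truth):
    """Root-mean-square translational error between two position sequences.

    Assumption: both sequences are sampled at the same timestamps and are
    expressed in the same coordinate frame (calibration already applied).
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and equally long")
    # Squared Euclidean distance between each pair of corresponding positions.
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(p_est, p_gt))
        for p_est, p_gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical positions in metres, for illustration only.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
est = [(0.0, 0.0, 0.0), (1.0, 0.03, 0.04), (2.0, 0.0, 0.0)]
print(ate_rmse(est, gt))
```

Any residual time offset or frame misalignment between the two sequences shows up directly as error in this metric, which is why the quality of the ground truth bounds the precision of the benchmark.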

artificial intelligence, calibration, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2512.07221

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.69)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Multimodal "Puppeteer": Exploring Robot Teleoperation Via Virtual Counterpart with LLM-Driven Voice and Gesture Interaction in Augmented Reality

Zhang, Yuchong, Orthmann, Bastian, Ji, Shichen, Welle, Michael, Van Haastregt, Jonne, Kragic, Danica

arXiv.org Artificial Intelligence, Dec-2-2025

The integration of robotics and augmented reality (AR) offers promising opportunities to enhance human-robot interaction (HRI) by making teleoperation more transparent, spatially grounded, and intuitive. We present a head-mounted AR "puppeteer" framework in which users control a physical robot via interacting with its virtual counterpart robot using large language model (LLM)-driven voice commands and hand-gesture interaction on the Meta Quest 3. In a within-subject user study with 42 participants performing an AR-based robotic pick-and-place pattern-matching task, we compare two interaction conditions: gesture-only (GO) and combined voice+gesture (VG). Our results show that GO currently provides more reliable and efficient control for this time-critical task, while VG introduces additional flexibility but also latency and recognition issues that can increase workload. We further explore how prior robotics experience shapes participants' perceptions of each modality. Based on these findings, we distill a set of evidence-based design guidelines for AR puppeteer metaphoric robot teleoperation, implicating multimodality as an adaptive strategy that must balance efficiency, robustness, and user expertise rather than assuming that additional modalities are universally beneficial. Our work contributes empirical insights into how multimodal (voice+gesture) interaction influences task efficiency, usability, and user experience in AR-based HRI.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2506.13189

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.67)
Health & Medicine > Health Care Technology (0.46)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

A Neurosymbolic Framework for Interpretable Cognitive Attack Detection in Augmented Reality

Chen, Rongqian, Andreyev, Allison, Xiu, Yanming, Chilukuri, Joshua, Sen, Shunav, Imani, Mahdi, Li, Bin, Gorlatova, Maria, Tan, Gang, Lan, Tian

arXiv.org Artificial Intelligence, Dec-1-2025

Augmented Reality (AR) enriches human perception by overlaying virtual elements onto the physical world. However, this tight coupling between virtual and real content makes AR vulnerable to cognitive attacks: manipulations that distort users' semantic understanding of the environment. Existing detection methods largely focus on visual inconsistencies at the pixel or image level, offering limited semantic reasoning or interpretability. To address these limitations, we introduce CADAR, a neuro-symbolic framework for cognitive attack detection in AR that integrates neural and symbolic reasoning. CADAR fuses multimodal vision-language representations from pre-trained models into a perception graph that captures objects, relations, and temporal contextual salience. Building on this structure, a particle-filter-based statistical reasoning module infers anomalies in semantic dynamics to reveal cognitive attacks. This combination provides both the adaptability of modern vision-language models and the interpretability of probabilistic symbolic reasoning. Preliminary experiments on an AR cognitive-attack dataset demonstrate consistent advantages over existing approaches, highlighting the potential of neuro-symbolic methods for robust and interpretable AR security.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2508.09185

Country: North America > United States (0.28)

Genre:

Overview (0.66)
Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (1.00)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

A Virtual Mechanical Interaction Layer Enables Resilient Human-to-Robot Object Handovers

Faris, Omar, Tadeja, Sławomir, Forni, Fulvio

arXiv.org Artificial Intelligence, Nov-26-2025

Object handover is a common form of interaction that is widely present in collaborative tasks. However, achieving it efficiently remains a challenge. We address the problem of ensuring resilient robotic actions that can adapt to complex changes in object pose during human-to-robot object handovers. We propose the use of Virtual Model Control to create an interaction layer that controls the robot and adapts to the dynamic changes in the handover process. Additionally, we propose the use of augmented reality to facilitate bidirectional communication between humans and robots during handovers. We assess the performance of our controller in a set of experiments that demonstrate its resilience to various sources of uncertainties, including complex changes to the object's pose during the handover. Finally, we performed a user study with 16 participants to understand human preferences for different robot control profiles and augmented reality visuals in object handovers. Our results showed a general preference for the proposed approach and revealed insights that can guide further development in adapting the interaction with the user. Human-to-robot object handover is a fundamental task that frequently occurs in collaborative manipulation.

artificial intelligence, human computer interaction, robot, (15 more...)

arXiv.org Artificial Intelligence

2511.19543

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.57)

Generative Augmented Reality: Paradigms, Technologies, and Future Applications

Liang, Chen, Zheng, Jiawen, Zeng, Yufeng, Tan, Yi, Lyu, Hengye, Zheng, Yuhui, Li, Zisu, Weng, Yueting, Shi, Jiaxin, Zhang, Hanwang

arXiv.org Artificial Intelligence, Nov-24-2025

This paper introduces Generative Augmented Reality (GAR) as a next-generation paradigm that reframes augmentation as a process of world re-synthesis rather than world composition by a conventional AR engine. GAR replaces the conventional AR engine's multi-stage modules with a unified generative backbone, where environmental sensing, virtual content, and interaction signals are jointly encoded as conditioning inputs for continuous video generation. We formalize the computational correspondence between AR and GAR, survey the technical foundations that make real-time generative augmentation feasible, and outline prospective applications that leverage its unified inference model. We envision GAR as a future AR paradigm that delivers high-fidelity experiences in terms of realism, interactivity, and immersion, while eliciting new research challenges on technologies, content ecosystems, and the ethical and societal implications.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.16783

Country:

North America > United States (0.69)
Asia > China (0.46)

Genre:

Overview (1.00)
Research Report (0.81)

Industry:

Media (0.93)
Education > Educational Setting (0.92)
Information Technology > Services (0.67)
Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(4 more...)

AI Assisted AR Assembly: Object Recognition and Computer Vision for Augmented Reality Assisted Assembly

Kyaw, Alexander Htet, Ma, Haotian, Zivkovic, Sasa, Sabin, Jenny

arXiv.org Artificial Intelligence, Nov-17-2025

We present an AI-assisted Augmented Reality assembly workflow that uses deep learning-based object recognition to identify different assembly components and display step-by-step instructions. For each assembly step, the system displays a bounding box around the corresponding components in the physical space, and where the component should be placed. By connecting assembly instructions with the real-time location of relevant components, the system eliminates the need for manual searching, sorting, or labeling of different components before each assembly. To demonstrate the feasibility of using object recognition for AR-assisted assembly, we highlight a case study involving the assembly of LEGO sculptures.

artificial intelligence, assembly, machine learning, (10 more...)

arXiv.org Artificial Intelligence

2511.05394

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.18)

Genre: Workflow (0.70)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

EEG-Driven AR-Robot System for Zero-Touch Grasping Manipulation

Wang, Junzhe, Xie, Jiarui, Hao, Pengfei, Li, Zheng, Cai, Yi

arXiv.org Artificial Intelligence, Nov-10-2025

Reliable brain-computer interface (BCI) control of robots provides an intuitive and accessible means of human-robot interaction, particularly valuable for individuals with motor impairments. However, existing BCI-Robot systems face major limitations: electroencephalography (EEG) signals are noisy and unstable, target selection is often predefined and inflexible, and most studies remain restricted to simulation without closed-loop validation. These issues hinder real-world deployment in assistive scenarios. To address them, we propose a closed-loop BCI-AR-Robot system that integrates motor imagery (MI)-based EEG decoding, augmented reality (AR) neurofeedback, and robotic grasping for zero-touch operation. A 14-channel EEG headset enabled individualized MI calibration, a smartphone-based AR interface supported multi-target navigation with direction-congruent feedback to enhance stability, and the robotic arm combined decision outputs with vision-based pose estimation for autonomous grasping. Experiments are conducted to validate the framework: MI training achieved 93.1 percent accuracy with an average information transfer rate (ITR) of 14.8 bit/min; AR neurofeedback significantly improved sustained control (SCI = 0.210) and achieved the highest ITR (21.3 bit/min) compared with static, sham, and no-AR baselines; and closed-loop grasping achieved a 97.2 percent success rate with good efficiency and strong user-reported control. These results show that AR feedback substantially stabilizes EEG-based control and that the proposed framework enables robust zero-touch grasping, advancing assistive robotic applications and future modes of human-robot interaction.
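The information transfer rate (ITR) figures above are given in bit/min. The abstract does not say which ITR definition was used, nor the number of targets or the trial duration, so the following is only a sketch of the Wolpaw definition that is widely used in BCI work: bits per selection are B = log2(N) + P*log2(P) + (1 - P)*log2((1 - P)/(N - 1)), scaled by the number of selections per minute. The function name and the example inputs are hypothetical and are not taken from the paper.

```python
import math


def wolpaw_itr(n_classes, accuracy, trial_seconds):
    """Wolpaw information transfer rate in bits per minute.

    n_classes:     number of selectable targets N (at least 2)
    accuracy:      classification accuracy P in [0, 1]
    trial_seconds: time needed for one selection, in seconds
    """
    if n_classes < 2:
        raise ValueError("need at least two classes")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if trial_seconds <= 0:
        raise ValueError("trial duration must be positive")
    bits = math.log2(n_classes)
    # The P*log2(P) and (1-P)*log2(...) terms vanish in the limits P -> 0
    # and P -> 1, so they are skipped there to avoid log2(0).
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_classes - 1))
    return bits * 60.0 / trial_seconds


# Hypothetical example: 4 targets, 90% accuracy, 5 s per selection.
print(wolpaw_itr(4, 0.90, 5.0))
```

Under this definition the ITR depends on accuracy, on the number of targets, and on selection time together, so two systems with the same accuracy can report very different bit/min values.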

artificial intelligence, human computer interaction, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2509.20656

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.36)
Information Technology > Artificial Intelligence > Cognitive Science > Neuroscience (0.35)

When Generative Artificial Intelligence meets Extended Reality: A Systematic Review

Ning, Xinyu, Zhuo, Yan, Wang, Xian, Sio, Chan-In Devin, Lee, Lik-Hang

arXiv.org Artificial Intelligence, Nov-6-2025

With the continuous advancement of technology, the application of generative artificial intelligence (AI) in various fields is gradually demonstrating great potential, particularly when combined with Extended Reality (XR), creating unprecedented possibilities. This survey article systematically reviews the applications of generative AI in XR, covering as much relevant literature as possible from 2023 to 2025. The application areas of generative AI in XR and its key technology implementations are summarised through PRISMA screening and analysis of the final 26 articles. The survey highlights existing articles from the last three years related to how XR utilises generative AI, providing insights into current trends and research gaps. We also explore potential opportunities for future research to further empower XR through generative AI, providing guidance and information for future generative XR research.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1080/10447318.2025.2565392

2511.03282

Country: Asia > China (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

This program is using augmented reality to teach preschoolers spatial awareness

Los Angeles Times, Sep-23-2025, 10:00:00 GMT

A child uses a tablet to play an augmented reality game meant to teach spatial awareness. Spatial thinking concepts are a part of early math that have largely been absent from preschool curricula.

augmented reality, lewis presser, teach preschooler spatial awareness, (11 more...)

Los Angeles Times

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.07)
North America > United States > New York (0.05)
North America > United States > Massachusetts (0.05)
(3 more...)

Industry:

Health & Medicine (1.00)
Media (0.96)
Government > Regional Government (0.70)
(4 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (0.69)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.63)

Perception Graph for Cognitive Attack Reasoning in Augmented Reality

Chen, Rongqian, Hong, Shu, Islam, Rifatul, Imani, Mahdi, Tan, G. Gary, Lan, Tian

arXiv.org Artificial Intelligence, Sep-9-2025

Augmented reality (AR) systems are increasingly deployed in tactical environments, but their reliance on seamless human-computer interaction makes them vulnerable to cognitive attacks that manipulate a user's perception and severely compromise user decision-making. To address this challenge, we introduce the Perception Graph, a novel model designed to reason about human perception within these systems. Our model operates by first mimicking the human process of interpreting key information from an MR environment and then representing the outcomes using a semantically meaningful structure. We demonstrate how the model can compute a quantitative score that reflects the level of perception distortion, providing a robust and measurable method for detecting and analyzing the effects of such cognitive attacks.

artificial intelligence, arxiv preprint arxiv, machine learning, (12 more...)

arXiv.org Artificial Intelligence

2509.05324

Country:

North America > United States > District of Columbia > Washington (0.16)
North America > United States > Pennsylvania (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry:

Leisure & Entertainment > Games > Computer Games (0.47)
Government > Military (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.97)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.74)
